
Frontiers - Publisher Connector

Springer - Publisher Connector

Comparison of sequence-dependent tiling array normalization approaches

Author: Chung Ho-Ryun
Vingron Martin
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The detection of enriched DNA or RNA fragments by tiling microarrays has become more and more popular. These microarrays contain a high number of small probes covering genomic loci. However, to achieve high coverage the probe sequences cannot be selected for their hybridization properties. The affinity of the probes towards their targets varies in a sequence-dependent manner. In order to remove this bias a number of approaches have been developed and shown to increase the detection of enriched DNA or RNA fragments. However, these approaches also employ a peak detection algorithm that is different from the one used previously. Thus, it seems possible that the enhancement of detection is due to the peak detection algorithm rather than the sequence-dependent normalization. Results: We compared three different sequence-dependent probe level normalization procedures to a naïve sequence-independent normalization technique. In order to achieve maximal comparability, we used the normalized intensity values as input to a single peak detection algorithm. A so-called "spike-in" data set served as benchmark for the performance. We will show that the sequence-dependent normalization procedures do not perform better than the naïve approach, suggesting that the benefit of using these normalization approaches is limited. Furthermore, we will show that the naïve approach does well, because it effectively removes the sequence-dependent component of the measured intensities with the help of the control hybridization experiment. Conclusion: Sequence-dependent normalization of microarray data hardly improves the detection of enriched DNA or RNA fragments. The "success" of the sequence-independent naïve approach is only possible due to the control experiment and requires proper scaling of the measured intensities.

Springer - Publisher Connector

A joint model of regulatory and metabolic networks

Author: Vingron Martin
Yeang Chen-Hsiang
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Gene regulation and metabolic reactions are two primary activities of life. Although many works have been dedicated to studying each system, the coupling between them is less well understood. To bridge this gap, we propose a joint model of gene regulation and metabolic reactions. RESULTS: We integrate regulatory and metabolic networks by adding links specifying the feedback control from the substrates of metabolic reactions to enzyme gene expressions. We adopt two alternative approaches to build those links: inferring the links between metabolites and transcription factors to fit the data or explicitly encoding the general hypotheses of feedback control as links between metabolites and enzyme expressions. Perturbation data are explained by paths in the joint network if the predicted response along the paths is consistent with the observed response. The consistency requirement for explaining the perturbation data imposes constraints on the attributes in the network such as the functions of links and the activities of paths. We build a probabilistic graphical model over the attributes to specify these constraints, and apply an inference algorithm to identify the attribute values which optimally explain the data. The inferred models allow us to 1) identify the feedback links between metabolites and regulators and their functions, 2) identify the active paths responsible for relaying perturbation effects, 3) computationally test the general hypotheses pertaining to the feedback control of enzyme expressions, 4) evaluate the advantage of an integrated model over separate systems. CONCLUSION: The modeling results provide insight into the mechanisms of the coupling between the two systems and possible "design rules" pertaining to enzyme gene regulation. The model can be used to investigate the less well-probed systems and generate consistent hypotheses and predictions for further validation.

eScholarship - University of California

Public Library of Science (PLOS)

Evidence for Gene-Specific Rather Than Transcription Rate–Dependent Histone H3 Exchange in Yeast Coding Regions

Author: Gat-Viks Irit
Vingron Martin
Publication venue: Public Library of Science
Publication date: 01/01/2009

In eukaryotic organisms, histones are dynamically exchanged independently of DNA replication. Recent reports show that different coding regions differ in their amount of replication-independent histone H3 exchange. The current paradigm is that this histone exchange variability among coding regions is a consequence of transcription rate. Here we put forward the idea that this variability might also be modulated in a gene-specific manner independently of transcription rate. To that end, we study transcription rate–independent replication-independent coding region histone H3 exchange. We term such events relative exchange. Our genome-wide analysis shows conclusively that in yeast, relative exchange is a novel consistent feature of coding regions. Outside of replication, each coding region has a characteristic pattern of histone H3 exchange that is either higher or lower than what was expected by its RNAPII transcription rate alone. Histone H3 exchange in coding regions might be a way to add or remove certain histone modifications that are important for transcription elongation. Therefore, our results that gene-specific coding region histone H3 exchange is decoupled from transcription rate might hint at a new epigenetic mechanism of transcription regulation.

Repository for Publications and Research Data

Inferring the paths of somatic evolution in cancer

Author: Misra Navodit
Szczurek Ewa
Vingron Martin
Publication date: 02/08/2017

Motivation: Cancer cell genomes acquire several genetic alterations during somatic evolution from a normal cell type. The relative order in which these mutations accumulate and contribute to cell fitness is affected by epistatic interactions. Inferring their evolutionary history is challenging because of the large number of mutations acquired by cancer cells as well as the presence of unknown epistatic interactions. Results: We developed Bayesian Mutation Landscape (BML), a probabilistic approach for reconstructing ancestral genotypes from tumor samples for much larger sets of genes than previously feasible. BML infers the likely sequence of mutation accumulation for any set of genes that is recurrently mutated in tumor samples. When applied to tumor samples from colorectal, glioblastoma, lung and ovarian cancer patients, BML identifies the diverse evolutionary scenarios involved in tumor initiation and progression in greater detail, but broadly in agreement with prior results. Availability and implementation: Source code and all datasets are freely available at bml.molgen.mpg.de Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

RERO DOC Digital Library

Large scale hierarchical clustering of protein sequences

Author: Krause Antje
Stoye Jens
Vingron Martin
Publication venue: 'American Fisheries Society'
Publication date: 05/07/2007

Background: Searching a biological sequence database with a query sequence looking for homologues has become a routine operation in computational biology. In spite of the high degree of sophistication of currently available search routines it is still virtually impossible to identify quickly and clearly a group of sequences that a given query sequence belongs to. Results: We report on our developments in grouping all known protein sequences hierarchically into superfamily and family clusters. Our graph-based algorithms take into account the topology of the sequence space induced by the data itself to construct a biologically meaningful partitioning. We have applied our clustering procedures to a non-redundant set of about 1,000,000 sequences resulting in a hierarchical clustering which is being made available for querying and browsing at http://systers.molgen.mpg.de/. Conclusions: Comparisons with other widely used clustering methods on various data sets show the abilities and strengths of our clustering methods in producing a biologically meaningful grouping of protein sequences.

Characteristic differences between the promoters of intron-containing and intronless ribosomal protein genes in yeast

Author: Roepcke Stefan
Vingron Martin
Zhang Jing
Publication venue: BioMed Central
Publication date: 01/10/2008

Background: More than two thirds of the highly expressed ribosomal protein (RP) genes in Saccharomyces cerevisiae contain introns, which is in sharp contrast to the genome-wide five percent intron-containing genes. It is well established that introns carry regulatory sequences and that the transcription of RP genes is extensively and coordinately regulated. Here we test the hypotheses that introns are innately associated with heavily transcribed genes and that introns of RP genes contribute regulatory TF binding sequences. Moreover, we investigate whether promoter features are significantly different between intron-containing and intronless RP genes. Results: We find that directly measured transcription rates tend to be lower for intron-containing compared to intronless RP genes. We do not observe any specifically enriched sequence motifs in the introns of RP genes other than those of the branch point and the two splice sites. Comparing the promoters of intron-containing and intronless RP genes, we detect differences in number and position of Rap1-binding and IFHL motifs. Moreover, the analysis of the length distribution and the folding free energies suggest that, at least in a sub-population of RP genes, the 5' untranslated sequences are optimized for regulatory function. Conclusion: Our results argue against the direct involvement of introns in the regulation of transcription of highly expressed genes. Moreover, systematic differences in motif distributions suggest that RP transcription factors may act differently on intron-containing and intronless gene promoters. Thus, our findings contribute to the decoding of the RP promoter architecture and may fuel the discussion on the evolution of introns.

Springer - Publisher Connector

Large scale hierarchical clustering of protein sequences

Author: Krause Antje
Stoye Jens
Vingron Martin
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Searching a biological sequence database with a query sequence looking for homologues has become a routine operation in computational biology. In spite of the high degree of sophistication of currently available search routines it is still virtually impossible to identify quickly and clearly a group of sequences that a given query sequence belongs to. RESULTS: We report on our developments in grouping all known protein sequences hierarchically into superfamily and family clusters. Our graph-based algorithms take into account the topology of the sequence space induced by the data itself to construct a biologically meaningful partitioning. We have applied our clustering procedures to a non-redundant set of about 1,000,000 sequences resulting in a hierarchical clustering which is being made available for querying and browsing at http://systers.molgen.mpg.de/. CONCLUSIONS: Comparisons with other widely used clustering methods on various data sets show the abilities and strengths of our clustering methods in producing a biologically meaningful grouping of protein sequences.

Publications at Bielefeld University